VER-Net: a hybrid transfer learning model for lung cancer detection using CT scan images

Background: Lung cancer is the second most common cancer worldwide, with over two million new cases per year. Early identification would allow healthcare practitioners to treat it more effectively. Advances in computer-aided detection systems have significantly influenced clinical analysis and decision-making for human disease. Machine learning and deep learning techniques are being applied successfully to this end. Owing to its several advantages, transfer learning has become popular for disease detection based on image data.
Methods: In this work, we build a novel transfer learning model (VER-Net) by stacking three different transfer learning models to detect lung cancer using lung CT scan images. The model is trained to map the CT scan images to four lung cancer classes. Various measures, such as image preprocessing, data augmentation, and hyperparameter tuning, are taken to improve the efficacy of VER-Net. All the models are trained and evaluated using multiclass chest CT images.
Results: The experimental results confirm that VER-Net outperformed the eight other transfer learning models it was compared with. VER-Net scored 91%, 92%, 91%, and 91.3% for accuracy, precision, recall, and F1-score, respectively. Compared to the state-of-the-art, VER-Net has better accuracy.
Conclusion: VER-Net is not only effective for lung cancer detection but may also be useful for other diseases for which CT scan images are available.


Introduction
Lung cancer is one of the leading causes of cancer-related deaths globally, with the highest mortality rate among all types of cancer. It is broadly classified as small-cell and non-small-cell lung cancer [1]. According to the World Health Organization¹, cancer is a significant contributor to global mortality, resulting in approximately 10 million fatalities in 2020, which accounts for roughly one out of every six deaths. WHO estimated that one in 16 people would be diagnosed with lung cancer worldwide by 2022. Figure 1 represents the incidence cases and deaths of cancers for both sexes and all age groups worldwide². The x-axis represents the number of people, while the y-axis denotes the types of cancers. Amongst all cancers, lung cancer has a significantly higher mortality rate. Additionally, when considering the number of incident cases, lung cancer ranks second among all types of cancer.
Roughly one-third of cancer-related deaths can be attributed to tobacco usage, a high body mass index, alcohol consumption, inadequate consumption of fruits and vegetables, and a lack of physical activity [2]. In addition, international agencies for cancer research have identified several risk factors that contribute to the development of various cancers, including alcohol, dietary exposures, infections, obesity, radiation, and many more. Lung cancer is caused by the abnormal growth of cells that form a tumour and can have serious consequences if left untreated. Early detection and effective treatment can lead to successful cures for many forms of cancer. Early detection is also crucial for improving the survival rate and reducing mortality [3].
Lung cancer is a respiratory illness that affects people of all ages. Symptoms of lung cancer include changes in voice, coughing, chest pain, shortness of breath, weight loss, wheezing, and other painful symptoms [4]. Non-small-cell lung cancer is frequently observed and has various subtypes, including adenocarcinoma, squamous cell cancer, and large cell carcinoma [5]. However, small-cell lung cancer spreads faster and is often fatal.

¹ https://www.who.int/news-room/fact-sheets/detail/cancer
² https://www.iarc.who.int/
Over the decades, clinical pathways and pathological treatments for lung cancer have included chemotherapy, targeted drugs, and immunotherapy [6]. In hospitals, doctors use different imaging techniques; while chest X-rays are the most cost-effective method of diagnosis, they require skilled radiologists to interpret the images accurately, as these can be complex and may overlap with other lung conditions [7]. Various lung diagnosis methods exist in the medical industry that use CT (computed tomography), isotopes, X-rays, MRI (magnetic resonance imaging), etc. [8,9].
Manual identification of lung cancer can be a time-consuming process subject to interpretation, causing delays in diagnosis and treatment. Additionally, the severity of the disease may not be apparent on X-ray images.
As artificial intelligence (AI) has advanced, deep learning has become increasingly popular in analyzing medical images. Deep learning can automatically discover high-dimensional features, in contrast to the more intuitive visual assessment of images that is often performed by skilled clinicians [10-12]. Convolutional neural networks (CNNs) are promising for extracting more powerful and deeper features from these images [13]. The development of CNNs has brought significant improvements in the ability to identify images and extract features from them [14,15]. Advanced CNNs have been shown to improve the accuracy of predictions significantly. In recent years, the development of computer-aided detection (CAD) has shown promising results in medical image analysis [16,17]. Deep learning techniques, particularly transfer learning, have emerged as a powerful means of leveraging pre-trained models and improving the performance of deep learning models [18].
Transfer learning has gained significant attention and success in various fields of AI, including medical image diagnosis [19], computer vision [20], natural language processing [21], speech recognition [22], and many more. Transfer learning involves using pre-trained neural networks to take the knowledge gained from one task (source task) and apply it to a different but related task (target task) [23]. In transfer learning, a model pre-trained on a large dataset for a specific task can be fine-tuned on similar datasets for different tasks.
Transfer learning has recently shown much promise in making it easier to detect lung cancer from medical imaging data. In this paper, we employed different transfer learning models for lung cancer detection using CT images. We proposed a hybrid model to enhance the prediction capability of the pre-trained models. The key contributions of this paper are:
1. The original image dataset is resized into 460 × 460 × 3.
2. Random oversampling is applied to add synthetic images to the minority class.

The authors of [28] experimented with a novel residual neural network with a transfer learning technique to identify pathology in lung cancer subtypes from medical images for an accurate and reliable diagnosis. The suggested model was pre-trained on the public medical image dataset LUNA16 and fine-tuned using their proprietary lung cancer dataset from Shandong Provincial Hospital. Their approach identifies pathological lung cancer from CT scans with an accuracy of 85.71%.
Han et al. [29] developed a framework to assess the potential of PET/CT images in distinguishing between different histologic subtypes of non-small cell lung cancer (NSCLC). They evaluated ten feature selection techniques, ten machine learning models, and the VGG16 deep learning algorithm to construct an optimal classification model. VGG16 achieved the highest accuracy of 84.1% among all the models.
Vijayan et al. [30] employed three optimizers with six deep learning models: AlexNet, GoogleNet, ResNet, Inception V3, EfficientNet b0, and SqueezeNet. The effectiveness of the models was measured by comparing their results under the stochastic gradient descent with momentum, Adam, and RMSProp optimization strategies. According to the findings of their study, GoogleNet using Adam as the optimizer achieves an accuracy of 92.08%.
Nóbrega et al. [31] developed a classification model using deep transfer learning based on CT scan lung images. Several feature extraction models, including VGG16, VGG19, MobileNet, Xception, InceptionV3, ResNet50, Inception-ResNet-V2, DenseNet169, DenseNet201, NASNetMobile and NASNetLarge, were utilized to analyze the Lung Image Database Consortium and Image Database Resource Initiative (LIDC/IDRI) dataset. Among all the algorithms, the combination of CNN-ResNet50 and SVM-RBF (support vector machine with radial basis function) was found to be the most effective deep extractor and classifier for identifying lung nodule malignancy in chest CT images, achieving an accuracy of 88.41% and an AUC of 93.19%. The authors calculated other performance evaluation metrics to validate the proposed model.
Dadgar & Neshat [32] proposed a novel hybrid convolutional deep transfer learning model to detect three common types of lung cancer: Squamous Cell Carcinoma (SCC), Large Cell Carcinoma (LCC), and Adenocarcinoma. The model included several pre-trained deep learning architectures, such as VGG16, ResNet152V2, MobileNetV3 (small and large), InceptionResNetV2, and EfficientNetV2, which were compared and evaluated in combination with fully connected, dropout, and batch-normalization layers, with adjustments made to the hyperparameters. After preprocessing 1000 CT scans from a public dataset, the best-performing model was identified as InceptionResNetV2 with transfer learning, achieving an accuracy of 91.1%, precision of 84.9%, AUC of 95.8%, and F1-score of 81.5% in classifying three types of lung cancer from normal samples.
Worku et al. [33] proposed a denoising-first two-path CNN (DFD-Net) for lung cancer detection. During preprocessing, a residual learning denoising model (DR-Net) is used to remove the noise. Then, a two-path convolutional neural network is used to identify lung cancer, with the denoised image from DR-Net as input. The two paths focus on the joint integration of local and global features. To enhance performance further, the authors moved away from traditional feature concatenation and directly integrated two sets of features from several CNN layers. The authors also overcame image label imbalance difficulties and achieved an accuracy of 87.8% for predicting lung cancer.
Sari et al. [34] implemented a CAD system using deep learning on CT images to classify lung cancer. They used transfer learning and a modified ResNet50 architecture to classify lung cancer images into four categories. The results obtained from this modified architecture show an accuracy of 93.33%, sensitivity of 92.75%, precision of 93.75%, F1-score of 93.25%, and AUC of 0.559. The study found that the modified ResNet50 outperforms the other two architectures, EfficientNetB1 and AlexNet, in accurately classifying lung cancer images into Adenocarcinoma, large carcinoma, normal, and squamous carcinoma categories.
Overall, these studies show that transfer learning has the potential to improve how well medical imaging data can be used to find lung cancer. Using pre-trained deep neural networks can significantly reduce the need for large datasets and reduce training time, making them more accessible for clinical applications. However, more research is needed to find the best architecture for transfer learning and the best fine-tuning strategy for detecting lung cancer. Further studies can focus on improving the interpretability and generalization of transfer learning models for real-world applications.

Research methodology
The details of the requirements and experimental steps carried out in this paper are discussed in this section.

Framework
The proposed model follows a seven-phase structure, as shown in Fig. 3. After acquiring the chest CT scan images, they were preprocessed and augmented to make them suitable for the experiment. The processed dataset is divided into training, validation, and testing sets. Eight popular transfer learning models were executed based on this data. Among them, the top three were selected and stacked to build a new prediction model.

Dataset description
The chest CT images utilized in this study were obtained from Kaggle³. The dataset contains CT scan images of three types of lung cancer: adenocarcinoma, large cell carcinoma, and squamous cell carcinoma. Class-wise samples of lung cancer CT images are depicted in Fig. 4. The detailed distribution of the dataset in terms of the total images, number of images in each class, number of classes, and labelling in each category is given in Table 1.
Adenocarcinoma: Lung adenocarcinoma⁴ is the most common form of lung cancer, accounting for 30% of all cases and about 40% of all non-small cell lung cancer occurrences. Adenocarcinomas are found in several common cancers, including breast, prostate and colorectal. Adenocarcinomas of the lung are found in the outer region of the lung in glands that secrete mucus and help us breathe. Symptoms include coughing, hoarseness, weight loss and weakness.
Large cell carcinoma: Large-cell undifferentiated carcinoma⁵ grows and spreads quickly and can be found anywhere in the lung. This type of lung cancer usually accounts for 10 to 15% of all cases.
Squamous cell carcinoma: Squamous cell carcinoma⁶ is found centrally in the lung, where the larger bronchi join the trachea to the lung, or in one of the main airway branches. Squamous cell lung cancer is responsible for about 30% of all non-small cell lung cancers and is generally linked to smoking.
The last category is the normal CT scan images.

Data preprocessing
To develop a robust and reliable automated system, data preprocessing plays a crucial role in the model-building process [35-37]. Preprocessing is an essential step to eliminate distortions from the images. In this study, image resizing and data augmentation were used for better classification and detection of lung cancer, as discussed in the subsections below.

Image resizing
The loaded images are standardized and normalized using a standard scaler and min-max scaler as the normalization functions.The files are resized from 224 × 224 to 460 × 460 using a resize function.The classes undergo label encoding, i.e., 0 for class Adenocarcinoma, 1 for class Large cell carcinoma, 2 for class Normal and 3 for class Squamous cell carcinoma.
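The preprocessing steps described above can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, nearest-neighbour interpolation is an assumption (the text does not state which interpolation the resize function used), and images are represented as nested lists of grayscale values for simplicity. Only the label mapping is taken directly from the text.

```python
# Label encoding as stated in the text.
CLASS_TO_LABEL = {
    "Adenocarcinoma": 0,
    "Large cell carcinoma": 1,
    "Normal": 2,
    "Squamous cell carcinoma": 3,
}


def min_max_scale(pixels):
    """Rescale a flat list of pixel values to the range [0, 1]."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:  # constant image: avoid division by zero
        return [0.0 for _ in pixels]
    return [(p - lo) / (hi - lo) for p in pixels]


def resize_nearest(image, new_h, new_w):
    """Nearest-neighbour resize of a 2-D image (assumed interpolation)."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


def encode_labels(class_names):
    """Map class names to the integer labels used for training."""
    return [CLASS_TO_LABEL[name] for name in class_names]


scaled = min_max_scale([0, 51, 102, 255])
resized = resize_nearest([[1, 2], [3, 4]], 4, 4)
labels = encode_labels(["Normal", "Adenocarcinoma"])
```

In practice the same steps would be applied per image with an array library, resizing 224 × 224 inputs to 460 × 460 before scaling.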

Data augmentation
Random oversampling was then applied to add randomly duplicated examples to the minority classes, i.e., the classes containing fewer samples in the dataset. Initially, the dataset comprised 1000 images, with the four classes containing 338, 187, 260 and 215 images. The final dataset after oversampling contains 1653 images, with the classes containing 411, 402, 374 and 466 images, as shown in Table 2.
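Random oversampling by duplication can be sketched as follows. This is a hedged illustration only: the function name, the per-class target dictionary, and the class keys "a" to "d" are hypothetical, and the text does not say which count belongs to which class or how the target sizes were chosen. The counts themselves are the ones reported above.

```python
import random
from collections import Counter


def random_oversample(samples, labels, targets, seed=0):
    """Duplicate randomly chosen samples until each class reaches its target size."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    out_samples, out_labels = list(samples), list(labels)
    for label, target in targets.items():
        pool = by_class[label]
        # Draw with replacement from the original members of the class.
        for _ in range(max(0, target - len(pool))):
            out_samples.append(rng.choice(pool))
            out_labels.append(label)
    return out_samples, out_labels


# Class sizes before and after oversampling, as reported in the text.
before = {"a": 338, "b": 187, "c": 260, "d": 215}
after = {"a": 411, "b": 402, "c": 374, "d": 466}

labels = [name for name, count in before.items() for _ in range(count)]
samples = list(range(len(labels)))  # stand-ins for image identifiers
new_samples, new_labels = random_oversample(samples, labels, after)
class_counts = Counter(new_labels)
```

Because the added items are copies of existing images, oversampling should be followed by augmentation (as the text does) so that duplicates do not appear identical during training.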
After that, data augmentation was applied with shear_range = 0.2, zoom_range = 0.2, rotation_range = 24, horizontal_flip = True, and vertical_flip = True. Finally, the dataset is split into training, testing and validation sets in proportions of 64.48%, 26.98% and 8.52%, respectively. After preprocessing and the train-test split, the data is fed to the models for training.
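To make the split proportions concrete, the sketch below derives set sizes from the reported percentages and the 1653-image dataset. The function name, the shuffling step, and the rounding rule are assumptions; the text does not state whether the split was stratified or how fractional counts were handled.

```python
import random


def split_indices(n, train_frac, val_frac, seed=0):
    """Shuffle indices and cut them into train, validation, and test parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1653, 0.6448, 0.0852)
sizes = (len(train_idx), len(val_idx), len(test_idx))
test_share = len(test_idx) / 1653
```

Under these assumptions the remainder assigned to the test set corresponds to the reported 26.98% share.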

Transfer learning models
Transfer learning models play a significant role in healthcare for medical image processing [23,31]. Medical imaging technologies, such as X-rays, CT scans, MRI scans, and histopathology slides, generate vast amounts of visual data that require accurate and efficient analysis. Transfer learning enables the utilization of pre-trained models trained on large datasets from various domains, such as natural images, to tackle medical image processing tasks [28]. The transfer learning models that are considered in this experiment are described in this section.
where H_i is the output of the current layer, f is the activation function, and [H_0, H_1, H_2, …, H_{i−1}] are the outputs of all previous layers concatenated together. Also, W_{i+1} is the set of weights for the convolutional layer, BN is the batch normalization operation, and H_{i+1} is the output of the transition layer.
The unique advantages of DenseNet201 are:
• Dense connectivity pattern between layers, allowing for feature reuse.
• Reduces the vanishing gradient problem and encourages feature propagation.
• Achieves high accuracy while using fewer parameters compared to other models.
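Eq. 1 and Eq. 2 for DenseNet201 are described here only through their symbols. A form consistent with those definitions, offered as a reconstruction rather than the authors' exact notation, and assuming that H_{i+1} denotes the transition-layer output and * denotes convolution, is:

```latex
\begin{align}
H_i &= f\big([H_0, H_1, H_2, \ldots, H_{i-1}]\big) \tag{1} \\
H_{i+1} &= f\big(\mathrm{BN}(W_{i+1} * H_i)\big) \tag{2}
\end{align}
```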

MobileNet
MobileNet [38] is a popular deep neural network architecture designed for mobile and embedded devices with limited computational resources. The architecture is based on a lightweight building block called a MobileNet unit, which consists of a depth-wise convolution layer followed by a pointwise convolution layer. Together these form a depth-wise separable convolution, a factorized convolution that decomposes a standard convolution into a depth-wise convolution and a pointwise convolution, which reduces the number of parameters and computations. The output of a MobileNet unit and inverted residual block can be calculated using Eq. 3 to Eq. 7.
where X is the input tensor, DW is the depth-wise convolution operation, Conv 1 × 1 is the pointwise convolution operation, σ is the activation function, BN is the batch normalization operation, and Y is the output tensor. Also, X_in is the input tensor, X is the output tensor of the bottleneck layer, and Conv 1 × 1 and DW are the pointwise and depth-wise convolution operations.
The unique advantages of MobileNet are:
• Specifically designed for mobile and embedded vision applications.
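The parameter saving from factorizing a standard convolution into depth-wise and pointwise parts can be checked with a short calculation. The sketch below is illustrative: the function names and the layer sizes (3 × 3 kernel, 64 input and 128 output channels) are hypothetical, and bias terms are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depth-wise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


standard = standard_conv_params(3, 64, 128)
separable = depthwise_separable_params(3, 64, 128)
ratio = separable / standard  # equals 1/c_out + 1/kernel**2
```

For these sizes the separable layer needs roughly 12% of the weights of the standard layer, which is the source of MobileNet's efficiency on constrained devices.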
where x is the input to the block, F is a set of convolutional layers with weights W_i, and y is the block output. The skip connection adds the input x to the output of F to produce the final output of the block.
The unique advantages of ResNet101 are:
• Residual connections that mitigate the vanishing gradient problem.
• Permits deeper network architecture without compromising performance.
• It is easy to train and achieves excellent accuracy.
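The residual block described by these symbols is commonly written as below. This is the standard residual formulation, given here as a reconstruction consistent with the definitions above rather than as the authors' exact equation:

```latex
y = F\big(x, \{W_i\}\big) + x
```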

EfficientNetB0
EfficientNetB0 [42] is a CNN architecture belonging to the EfficientNet model family. These models are specifically crafted to achieve top-tier performance while maintaining computational efficiency, rendering them suitable for various computer vision tasks. The central concept behind EfficientNet revolves around harmonizing model depth, width, and resolution to attain optimal performance. This is achieved through a compound scaling technique that uniformly adjusts these three dimensions to generate a range of models, with EfficientNetB0 as the baseline. The network comprises 16 blocks, each characterized by its width, determined by the number of channels (filters) in every convolutional layer. The number of channels is adjusted using a scaling coefficient. Additionally, the input image resolution for EfficientNetB0 typically remains fixed at 224 × 224 pixels.
The unique advantages of EfficientNetB0 are:
• Achieves state-of-the-art accuracy on image classification tasks.
• Uses a compound scaling method to balance model depth, width, and resolution.
• A more accurate and computationally efficient architecture design.
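Compound scaling can be illustrated numerically. The sketch below uses the base coefficients reported in the original EfficientNet paper (α = 1.2 for depth, β = 1.1 for width, γ = 1.15 for resolution); the function names are hypothetical, and the released B1 to B7 variants use their own published multipliers, so this is an illustration of the principle rather than a recipe for any specific variant.

```python
# Base coefficients from the EfficientNet paper (found by grid search on B0).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi, base_resolution=224):
    """Return depth and width multipliers and the scaled input resolution."""
    depth_mult = ALPHA ** phi
    width_mult = BETA ** phi
    resolution = base_resolution * GAMMA ** phi
    return depth_mult, width_mult, resolution


def flops_factor(phi):
    """Approximate FLOPs growth: depth * width^2 * resolution^2."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi


baseline = compound_scale(0)  # phi = 0 leaves the B0 baseline unchanged
one_step = flops_factor(1)    # close to 2, by design of the constraint
```

The constraint α · β² · γ² ≈ 2 means each unit increase of the compound coefficient roughly doubles the computational cost.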

EfficientNetB4
The EfficientNetB4 [43] neural network, consisting of blocks and segments, has residual units and parallel GPU utilization points. It is a part of the EfficientNet family of models, designed to be more computationally efficient than previous models while achieving state-of-the-art accuracy on various computer vision tasks, including image classification and object detection. The CNN backbone in EfficientNetB4 consists of a series of convolutional blocks, each with a set of operations, including convolution, batch normalization, and activation. The output of each block is fed into the next block as input.
The final convolutional block is followed by a set of fully connected layers responsible for classifying the input image. The output of a convolutional block can be calculated using Eq. 9.
where x i−1 is the input to the current block, W i is the set of weights for the convolutional layer, BN is the batch normalization operation, f is the activation function, and y i is the block output.
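A form of Eq. 9 consistent with the symbol definitions above, given as a reconstruction that assumes * denotes convolution and that the operations are applied in the order convolution, batch normalization, activation, is:

```latex
y_i = f\big(\mathrm{BN}(W_i * x_{i-1})\big) \tag{9}
```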
Being in the same family, EfficientNetB4 shares the advantages of EfficientNetB0.

VGG19
Visual Geometry Group (VGG) is a traditional CNN architecture. The VGG19 [44] model consists of 19 layers with 16 convolutional layers and three fully connected layers. The max-pooling layers are applied after every two or three convolutional layers. It has achieved high accuracy on various computer vision tasks, including image classification, object detection, and semantic segmentation. One of the main contributions of the VGG19 network is the use of very small convolution filters (3 × 3) in each layer, which allows for deeper architectures to be built with fewer parameters. The output of the convolutional layers can be calculated using Eq. 10.
where x is the input image, W is the weight matrix of the convolutional layer, b is the bias term, and f is the activation function, which is usually a rectified linear unit (ReLU) in VGG19.The output y is a feature map that captures the important information from the input image.
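The benefit of small 3 × 3 filters can be shown with a short calculation: two stacked 3 × 3 layers cover the same 5 × 5 receptive field as a single 5 × 5 layer but use fewer weights. The function names and the channel count of 64 are illustrative assumptions; stride 1 and equal input and output channels are assumed, and biases are ignored.

```python
def conv_weights(kernel, channels):
    """Weights of one k x k convolution with equal input and output channels."""
    return kernel * kernel * channels * channels


def stacked_receptive_field(kernel, n_layers):
    """Receptive field of n stacked k x k convolutions with stride 1."""
    return 1 + n_layers * (kernel - 1)


channels = 64
two_3x3 = 2 * conv_weights(3, channels)  # two stacked 3 x 3 layers
one_5x5 = conv_weights(5, channels)      # one 5 x 5 layer
field = stacked_receptive_field(3, 2)
```

The stacked version also applies an extra non-linearity between the two layers, which is part of why deeper stacks of small filters work well.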
The unique advantages of VGG19 are:
• Simple and straightforward architecture.
• Achieves good performance on various computer vision tasks.
• Its simplicity and ease of use make it a favourite among educators.

Proposed VER-Net model
To find the best-performing models among the ones discussed in the previous section, we ran them and assessed their performance individually. Among them, VGG19 and EfficientNetB0 were the best performers in all metrics. However, EfficientNetB4 and ResNet101 competed with each other for the third spot. In some metrics, EfficientNetB4 did better, while in others, ResNet101 was better.

Model Architecture
The architecture of the proposed VER-Net model is shown in Fig. 5. The input shape is 460 × 460 × 3, which is mapped to four classes as output. We used three different dense layers for the three stacked transfer learning models. Thereafter, the same convolution layers of 7 × 7 × 1024 for all three and three different max-pooling layers are used. The outputs are flattened before being sent to three fully connected layers (1024 × 512 × 256). The three outputs of these fully connected layers are then combined using majority voting, and the classified outputs are generated accordingly. The architectural description of VER-Net is shown in Table 3.
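The majority-voting step can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, each branch is assumed to emit one class label per image, and ties (all three branches disagreeing) are broken in favour of the first branch, a rule the text does not specify.

```python
from collections import Counter


def majority_vote(branch_predictions):
    """Combine per-branch class labels into one label per sample.

    branch_predictions: list of label lists, one list per model branch.
    Ties are resolved in favour of the earliest branch (an assumption).
    """
    voted = []
    for votes in zip(*branch_predictions):
        counts = Counter(votes)
        top = max(counts.values())
        winner = next(v for v in votes if counts[v] == top)
        voted.append(winner)
    return voted


# Hypothetical label outputs of three branches for four images.
branch_a = [0, 1, 2, 3]
branch_b = [0, 1, 3, 3]
branch_c = [1, 2, 3, 0]
final = majority_vote([branch_a, branch_b, branch_c])
```

An alternative design would average the branches' class probabilities (soft voting); the hard-voting variant shown here matches the wording of the text more closely.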

Model parameters
The details of the hyperparameter settings for VER-Net are listed in Table 4. In Table 5, the details of data augmentation are listed. Here, we used the RandomFlip and RandomRotation functions available in TensorFlow Keras for data augmentation.

Experiment, results and performance analysis
In this section, the experimental details, including system setup and evaluation metrics, are covered. The results are also presented in detail, and the performance of the proposed model is extensively assessed.

Experimental system setup
The experiment was conducted on a Dell workstation with a Microsoft Windows environment. Python was used for programming on the Anaconda framework. The details of the system are given in Table 6.

Evaluation Metrics
Evaluation metrics are used to assess the performance of a model on a problem statement. Different evaluation metrics are used depending on the problem type and the nature of the data. In this study, the experimental findings for the presented models are evaluated using various performance metrics, summarised in Table 7.

VER-Net model implementation
After designing the VER-Net model, we implemented it. The results are discussed below.

Confusion matrix
The classification performance of VER-Net is evaluated using a confusion matrix, as shown in Fig. 6.

Accuracy and loss of VER-Net
The accuracy and loss of our VER-Net model are plotted in Figs. 7 and 8, respectively. The x-axis denotes the number of epochs (100), while the y-axis reflects accuracy in Fig. 7 and loss in Fig. 8.

Performance analysis of VER-Net
In this section, we exhaustively analyze the performance of the VER-Net model. For this, we adopted a comparative analysis approach. We compared VER-Net with other transfer learning models and the results of similar research works.

Comparing VER-Net with other transfer learning models
First, we compare the performance of VER-Net with the individual transfer learning models mentioned in Sect. 3.4. All the models were trained and tested on the same dataset and validated with the same parameters. Figures 9 and 10 present the accuracy and loss comparisons. VER-Net and VGG19 both achieved the highest accuracy of 97.47% for training, but for testing, VER-Net emerged as the sole highest accuracy achiever with 91%. NASNetLarge got the lowest accuracy on both occasions, with 69.51% and 64% training and testing accuracy, respectively. Similar to accuracy, VER-Net and VGG19 both managed the lowest loss of 0.07% for training, and VER-Net was the sole lowest loss achiever for testing with 0.34%. Here also, NASNetLarge performed worst on both occasions with 0.66% and 0.80% training and testing loss, respectively.
Table 8 notes the precision, recall and F1-score values for all classes to compare VER-Net with other models. The macro average of these metrics for all four classes is shown in Fig. 11. For all three metrics, i.e., precision, recall and F1-score, VER-Net performed best with 0.920, 0.910, and 0.913, respectively. VGG19 and EfficientNetB0 emerged as the second and third-best performers, whereas NASNetLarge was the worst performer with 0.693, 0.645, and 0.645 for precision, recall and F1-score, respectively.
In Fig. 12, VER-Net is compared with others in terms of weighted average for precision, recall and F1-score. Here, we used a uniform weight of 1.5 for all classes. Like the macro average, VER-Net was the top performer for all three metrics, followed by VGG19 and EfficientNetB0, and NASNetLarge was the worst performer. As shown in Table 8, NASNetLarge classifies the non-cancerous cells with 100% accuracy; in fact, it performs the best among all models for this class but performs very poorly for the cancerous cells.
To assess the performance variations of VER-Net, we calculated the standard deviation across the classes for precision, recall and F1-score. A lower value suggests that the model is equally effective for all classes. In contrast, a higher variation suggests bias towards a certain class. From Fig. 13, it can be observed that VER-Net has the lowest variations for recall and F1-score, at 0.062 and 0.04, respectively. However, as an exception in the case of precision, VER-Net is bettered by DenseNet201 by a margin of 0.042. This can be attributed to VER-Net attaining 100% precision for the Normal class. Nevertheless, VER-Net has significantly lower variance than DenseNet201 across the three metrics.
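The class-wise variation measure can be sketched as a population standard deviation over the four per-class scores. The function name and the example scores are hypothetical, and dividing by n (rather than n − 1) is an assumption based on the formula summarised in Table 7.

```python
import math


def class_std(scores):
    """Population standard deviation of per-class metric values."""
    n = len(scores)
    mean = sum(scores) / n
    return math.sqrt(sum((s - mean) ** 2 for s in scores) / n)


# Hypothetical per-class precision values for two models.
balanced_model = [0.9, 0.9, 0.9, 0.9]  # equally good on every class
biased_model = [1.0, 0.8, 1.0, 0.8]    # favours two of the classes

balanced_spread = class_std(balanced_model)
biased_spread = class_std(biased_model)
```

Both hypothetical models have the same macro average, yet the second one has a visibly larger spread, which is the kind of class bias this measure is meant to expose.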

Comparing VER-Net with literature
In the previous section, we established the superiority of VER-Net over other established transfer learning models. To demonstrate the advantage of VER-Net further, we compared it with the results of some similar recent experiments available in the literature on detecting lung cancer from CT scan images using transfer learning methods. A comparative summary is given in Table 9.

Fig. 1
Fig. 1 Incident cases and mortality rate of different cancers

Fig. 2
Fig. 2 Advantages of transfer learning for lung cancer detection

The top three models were selected and stacked to build a new prediction model. The model was fine-tuned repeatedly to improve the classification accuracy while reducing the required training time. The model was trained and validated to classify three cancer classes and a normal class. Finally, the model was tested.

Fig. 3
Fig. 3 Framework of the proposed methodology

Accuracy: $\frac{TP + TN}{TP + TN + FP + FN}$. Gives the classification success of the model, where TP, TN, FP and FN denote true positives, true negatives, false positives, and false negatives.
Gives the number of positive estimates that are correctly classified.
F1-score: $\frac{2 \times TP}{2 \times TP + FP + FN}$. Relates the sensitivity and precision measures.
Loss: $-\frac{1}{n}\sum_{i=1}^{n}\sum_{c=1}^{k}\left[y_{ic} \times \log(\hat{y}_{ic})\right]$. Measures how well a model is performing by comparing its predictions with the actual target values, where n is the number of samples (4), k is the number of classes (4), $y_{ic}$ is the true label (one-hot encoded), and $\hat{y}_{ic}$ is the predicted probability for class c.
Macro average: $\frac{1}{4}\sum_{c=0}^{3} A^{m}_{c}$. The arithmetic mean of the individual class values for precision, recall, and F1-score, where c denotes classes 0 to 3 and m denotes either precision, recall, or F1-score.
Weighted average: $\sum_{c=0}^{3} w_{c} A^{m}_{c}$. The arithmetic mean of the individual class values multiplied by their respective weights for precision, recall, and F1-score, where $w_0 + w_1 + w_2 + w_3 = 1$.
Standard deviation: $\sqrt{\frac{\sum_{c=0}^{3}\left(S^{m}_{c} - \frac{1}{4}\sum_{c=0}^{3} A^{m}_{c}\right)^{2}}{n}}$. Deviation of the values from their mean for precision, recall, and F1-score, where n is the number of samples (4).
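The metric definitions above can be sketched directly in Python. The function names and the example inputs are hypothetical, and the cross-entropy uses the natural logarithm, which is an assumption since the base of the logarithm is not stated.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, fp, fn):
    """F1-score written directly in terms of TP, FP and FN."""
    return 2 * tp / (2 * tp + fp + fn)


def cross_entropy(y_true, y_pred):
    """Mean categorical cross-entropy for one-hot labels."""
    total = 0.0
    for truth, probs in zip(y_true, y_pred):
        for y, p in zip(truth, probs):
            if y > 0:  # only the true class contributes for one-hot labels
                total += y * math.log(p)
    return -total / len(y_true)


def macro_average(scores):
    """Unweighted mean of the per-class scores."""
    return sum(scores) / len(scores)


def weighted_average(scores, weights):
    """Weighted mean of the per-class scores; weights must sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * s for w, s in zip(weights, scores))


loss = cross_entropy([[1, 0, 0, 0]], [[0.5, 0.25, 0.125, 0.125]])
per_class = [0.9, 0.8, 1.0, 0.9]  # hypothetical per-class F1-scores
macro = macro_average(per_class)
weighted = weighted_average(per_class, [0.25, 0.25, 0.25, 0.25])
```

With equal weights the weighted average coincides with the macro average; the two differ only when classes are weighted unequally, for example by class support.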

Fig. 7
The training curves for accuracy (Fig. 7) and loss (Fig. 8) suggest how well VER-Net is trained. It can be observed that both accuracy and loss for validation/testing converge after approximately 20 epochs. It is further noticed that the model did not exhibit significant underfitting or overfitting after hyperparameter tuning. In our experiment, we tried different epoch numbers (40, 60, 100, and 200). We got the best results with 100 epochs.

Fig. 8
Fig. 8 Training and validation/test loss of the VER-Net model

Fig. 9

Fig. 10
Fig. 10 Loss comparison of the proposed ensemble method (VER-Net) with other transfer learning models

Fig. 13
Fig. 13 Standard deviation for precision, recall and F1-score of all classes

8. The accuracy of VER-Net is compared with the state-of-the-art.

Table 1
Detailed sample-wise distribution of dataset before resampling

Table 2
Detailed sample-wise distribution of dataset after resampling

DenseNet201
DenseNet201 [40] is a CNN with 201 layers. It is based on the DenseNet concept of densely connecting every layer to every other layer in a feedforward manner, which helps improve the flow of information and gradient propagation through the network. It is a part of the DenseNet family of models, designed to address the problem of vanishing gradients in very deep neural networks. The output of densely connected and transition layers can be calculated using Eq. 1 and Eq. 2.

Algorithm: Building the VER-Net model
Input: Training dataset, validation dataset, test dataset, training epochs, input shape, and batch size.

Table 3
Description of the VER-Net model's architecture

Table 4
Hyperparameter settings of VER-Net

Table 5
Data augmentation details of VER-Net

Since there are four output classes, the confusion matrix in Fig. 6 is a 4 × 4 matrix. Every column in the matrix represents a predicted class, whereas every row represents an actual class. The principal diagonal cells denote the correct predictions (TP) for the respective classes. Besides the TP cell, all other cells in the same row denote FN. For example, in the first row, excluding the first column, five of the Adenocarcinoma images were falsely classified as Large cell carcinoma, and four were categorized as Squamous cell carcinoma. So, 9 (5 + 0 + 4) are FN classifications for the Adenocarcinoma class. Similarly, besides the TP cell, all other cells in the same column denote FP. For example, in the first column, excluding the first row, four Large cell carcinoma, four Normal, and 21 Squamous cell carcinoma images were falsely classified as Adenocarcinoma. So, 29 (4 + 4 + 21) FP classifications exist for the Adenocarcinoma class. The rest of the cells denote TN predictions.
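Reading per-class counts off a confusion matrix can be sketched as below, using the standard convention that rows are actual classes and columns are predicted classes. The function name is hypothetical. Only the off-diagonal entries of the first row (5, 0, 4) and first column (4, 4, 21) come from the text; every other cell in the example matrix is a placeholder chosen for illustration.

```python
def per_class_counts(matrix, k):
    """Return (TP, FP, FN, TN) for class k; rows = actual, columns = predicted."""
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                 # rest of the row: missed cases
    fp = sum(row[k] for row in matrix) - tp  # rest of the column: false alarms
    tn = sum(sum(row) for row in matrix) - tp - fn - fp
    return tp, fp, fn, tn


# Class order: Adenocarcinoma, Large cell, Normal, Squamous cell.
# Diagonal and remaining cells are placeholders, not reported results.
confusion = [
    [100, 5, 0, 4],
    [4, 90, 1, 2],
    [4, 0, 95, 0],
    [21, 3, 0, 110],
]

tp, fp, fn, tn = per_class_counts(confusion, 0)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
```

Repeating the call for k = 1, 2, 3 gives the counts needed for the per-class precision, recall, and F1-score values.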

Table 6
System's hardware and software specifications for the experiment

Table 7
Performance evaluation metrics

Table 9
Comparing VER-Net with recent literature